10 research outputs found

    General theory for stochastic admixture graphs and F-statistics

    We provide a general mathematical framework based on the theory of graphical models to study admixture graphs. Admixture graphs are used to describe the ancestral relationships between past and present populations, allowing for population merges and migration events by means of gene flow. We give various mathematical properties of admixture graphs, with particular focus on properties of the so-called F-statistics. The Wright-Fisher model is also studied, and a general expression for the loss of heterozygosity is derived.
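For orientation only, the classical single-population case of heterozygosity loss under the Wright-Fisher model can be written as below. This is the textbook result for a randomly mating diploid population of constant size, not the general expression derived in the paper; the symbols $N$ (population size), $t$ (generations) and $H_0$ (initial heterozygosity) are assumptions introduced here.

```latex
\mathbb{E}[H_t] \;=\; H_0 \left(1 - \frac{1}{2N}\right)^{t}
\;\approx\; H_0 \, e^{-t/(2N)} \quad \text{for large } N .
```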

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that covers a variety of research fields, such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.

    Theory and inference on gene flow and ploidy numbers from NGS data


    Powerful Inference With the D-Statistic on Low-Coverage Whole-Genome Data

    The detection of ancient gene flow between human populations is an important issue in population genetics. A common tool for detecting ancient admixture events is the D-statistic. The D-statistic is based on the hypothesis of a genetic relationship that involves four populations, whose correctness is assessed by evaluating specific coincidences of alleles between the groups. When working with high-throughput sequencing data, calling genotypes accurately is not always possible; therefore, the D-statistic currently samples a single base from the reads of one individual per population. This implies ignoring much of the information in the data, an issue especially striking in the case of ancient genomes. We provide a significant improvement to overcome the problems of the D-statistic by considering all reads from multiple individuals in each population. We also apply type-specific error correction to combat the problems of sequencing errors, and show a way to correct for introgression from an external population that is not part of the supposed genetic relationship, and how this leads to an estimate of the admixture rate. We prove that the D-statistic is approximated by a standard normal distribution. Furthermore, we show that our method outperforms the traditional D-statistic in detecting admixtures. The power gain is most pronounced for low and medium sequencing depth (1–10×), and performances are as good as with perfectly called genotypes at a sequencing depth of 2×. We show the reliability of error correction in scenarios with simulated errors and ancient data, and correct for introgression in known scenarios to estimate the admixture rates.
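As a rough illustration of the four-population idea described above, the sketch below computes a frequency-based D-statistic in the commonly used form sum((p1 - p2)(p3 - p4)) / sum((p1 + p2 - 2 p1 p2)(p3 + p4 - 2 p3 p4)). It is not the read-based estimator, error correction, or normalization from the paper; the function name `d_statistic`, the per-site allele-frequency inputs, and the sign convention are assumptions made for this example.

```python
def d_statistic(p1, p2, p3, p4):
    """Frequency-based D-statistic over sites (illustrative sketch only).

    p1..p4 are equal-length sequences holding, per site, the frequency of
    the same reference allele in four populations (hypothetical inputs).
    """
    if not (len(p1) == len(p2) == len(p3) == len(p4)):
        raise ValueError("all populations must cover the same sites")

    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in zip(p1, p2, p3, p4):
        # Excess of one allele-sharing pattern over the other at this site.
        numerator += (a - b) * (c - d)
        # Total probability of either informative pattern at this site.
        denominator += (a + b - 2 * a * b) * (c + d - 2 * c * d)

    if denominator == 0:
        raise ValueError("no informative sites")
    return numerator / denominator
```

Under this convention a value near zero is what the assumed tree without gene flow predicts, because populations 1 and 2 then share alleles with population 3 equally often; the sign of a non-zero value depends on how the populations are ordered.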

    Qualitative analysis of tumor-infiltrating lymphocytes across human tumor types reveals a higher proportion of bystander CD8+ T cells in non-melanoma cancers compared to melanoma

    Background: Human intratumoral T cell infiltrates can be defined by quantitative or qualitative features, such as their ability to recognize autologous tumor antigens. In this study, we reproduced the tumor-T cell interactions of individual patients to determine and compare the qualitative characteristics of intratumoral T cell infiltrates across multiple tumor types. Methods: We employed 187 pairs of unselected tumor-infiltrating lymphocytes (TILs) and autologous tumor cells from patients with melanoma, renal cancer, ovarian cancer or sarcoma, and single-cell RNA sequencing data from a pooled cohort of 93 patients with melanoma or epithelial cancers. Measures of TIL quality, including the proportion of tumor-reactive CD8+ and CD4+ TILs and TIL response polyfunctionality, were determined. Results: Tumor-specific CD8+ and CD4+ TIL responses were detected in over half of the patients in vitro, and greater CD8+ TIL responses were observed in melanoma, regardless of previous anti-PD-1 treatment, compared to renal cancer, ovarian cancer and sarcoma. The proportion of tumor-reactive CD4+ TILs was on average lower and the differences less pronounced across tumor types. Overall, the proportion of tumor-reactive TILs in vitro was remarkably low, implying that a high fraction of TILs are bystanders, and highly variable within the same tumor type. In situ analyses, based on eight single-cell RNA-sequencing datasets encompassing melanoma and five epithelial cancer types, corroborated the results obtained in vitro. Strikingly, no strong correlation between the proportion of CD8+ and CD4+ tumor-reactive TILs was detected, suggesting that the accumulation of these responses in the tumor microenvironment follows non-overlapping biological pathways. Additionally, no strong correlation between TIL responses and tumor mutational burden (TMB) in melanoma was observed, indicating that TMB was not a major driving force of response. No substantial differences in polyfunctionality across tumor types were observed. Conclusions: These analyses shed light on the functional features defining the quality of TIL infiltrates in cancer. A significant proportion of TILs across tumor types, especially non-melanoma, are bystander T cells. These results highlight the need to develop strategies focused on the tumor-reactive TIL subpopulation.

    Rapid Identification of the Tumor-Specific Reactive TIL Repertoire via Combined Detection of CD137, TNF, and IFNγ, Following Recognition of Autologous Tumor-Antigens

    Detecting the entire repertoire of tumor-specific reactive tumor-infiltrating lymphocytes (TILs) is essential for investigating their immunological functions in the tumor microenvironment. Current in vitro assays identifying tumor-specific functional activation measure the upregulation of surface molecules, de novo production of antitumor cytokines, or mobilization of cytotoxic granules following recognition of tumor antigens, yet there is no widely adopted standard method. Here we established an enhanced, yet simple, method for simultaneously identifying CD8+ and CD4+ tumor-specific reactive TILs in vitro, using a combination of widely known and available flow cytometry assays. By combining the detection of intracellular CD137 and de novo production of TNF and IFNγ after recognition of naturally presented tumor antigens, we demonstrate that a larger fraction of tumor-specific and reactive CD8+ TILs can be detected in vitro compared to commonly used assays. This assay revealed multiple polyfunctionality-based clusters of both CD4+ and CD8+ tumor-specific reactive TILs. In situ, the combined detection of TNFRSF9, TNF, and IFNG identified most of the tumor-specific reactive TIL repertoire. In conclusion, we describe a straightforward method for efficient identification of the tumor-specific reactive TIL repertoire in vitro, which can be rapidly adopted in most cancer immunology laboratories.
